Gradient-Based Training of Gaussian Mixture Models for High-Dimensional Streaming Data
Abstract
We present an approach for efficiently training Gaussian Mixture Models (GMMs) by Stochastic Gradient Descent (SGD) with non-stationary, high-dimensional streaming data. Our training scheme does not require data-driven parameter initialization (e.g., k-means) and can thus be trained from a random initial state. Furthermore, the approach allows mini-batch sizes as low as 1, which are typical for streaming-data settings. Major problems in such settings are undesirable local optima during early training phases and numerical instabilities due to high data dimensionalities. We introduce an adaptive annealing procedure to address the first problem, whereas numerical instabilities are eliminated by an exponential-free approximation to the standard GMM log-likelihood. Experiments on a variety of visual and non-visual benchmarks show that our SGD approach can be trained completely without, for instance, k-means based centroid initialization. It also compares favorably to an online variant of Expectation-Maximization (EM), stochastic EM (sEM), which it outperforms by a large margin for very ...
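The numerical problem mentioned in the abstract can be illustrated with a small sketch. It assumes that the exponential-free approximation keeps only the best-matching mixture component (a max-component lower bound on the log-likelihood). The abstract does not spell out the exact form, so this is one plausible reading rather than the authors' definition. The function names, the diagonal covariances, and the toy parameters are all hypothetical.

```python
import math

def component_log_joint(x, weights, means, variances):
    """Return log(pi_k) + log N(x | mu_k, diag(var_k)) for every component k."""
    scores = []
    for pi_k, mu_k, var_k in zip(weights, means, variances):
        log_density = 0.0
        for x_d, mu_d, var_d in zip(x, mu_k, var_k):
            log_density += -0.5 * (math.log(2.0 * math.pi * var_d)
                                   + (x_d - mu_d) ** 2 / var_d)
        scores.append(math.log(pi_k) + log_density)
    return scores

def exact_log_likelihood(scores):
    """Standard GMM log-likelihood, computed with a shifted log-sum-exp."""
    top = max(scores)
    return top + math.log(sum(math.exp(s - top) for s in scores))

def max_component_log_likelihood(scores):
    """Assumed exponential-free variant: keep only the largest component score."""
    return max(scores)

# Toy example (hypothetical values): two components in 1000 dimensions.
dim = 1000
weights = [0.5, 0.5]
means = [[0.0] * dim, [1.0] * dim]
variances = [[1.0] * dim, [1.0] * dim]
x = [0.1] * dim

scores = component_log_joint(x, weights, means, variances)
exact = exact_log_likelihood(scores)
approx = max_component_log_likelihood(scores)

# Exponentiating a component score directly underflows to 0.0 in double
# precision at this dimensionality, so log(sum(exp(score))) cannot be
# evaluated naively.
naive_density = math.exp(scores[0])
```

The max-component value never exceeds the exact log-likelihood and differs from it by at most log K for K components, so it needs no exponentials and still tracks the quantity being optimized.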
Similar resources
Two-way Gaussian mixture models for high dimensional classification
Mixture discriminant analysis (MDA) has gained applications in a wide range of engineering and scientific fields. In this paper, under the paradigm of MDA, we propose a two-way Gaussian mixture model for classifying high dimensional data. This model regularizes the mixture component means by dividing variables into groups and then constraining the parameters for the variables in the same group ...
Gaussian mixture models for the classification of high-dimensional vibrational spectroscopy data
In this work, a family of generative Gaussian models designed for the supervised classification of high-dimensional data is presented as well as the associated classification method called High Dimensional Discriminant Analysis (HDDA). The features of these Gaussian models are: i) the representation of the input density model is smooth; ii) the data of each class are modeled in a specific subsp...
High-Dimensional Clustering with Sparse Gaussian Mixture Models
We consider the problem of clustering high-dimensional data using Gaussian Mixture Models (GMMs) with unknown covariances. In this context, the Expectation-Maximization (EM) algorithm, which is typically used to learn GMMs, fails to cluster the data accurately due to the large number of free parameters in the covariance matrices. We address this weakness by assuming that the mixture model consis...
Regularized Parameter Estimation in High-Dimensional Gaussian Mixture Models
Finite Gaussian mixture models are widely used in statistics thanks to their great flexibility. However, parameter estimation for high-dimensional Gaussian mixture models can be challenging because of the large number of parameters that need to be estimated. In this letter, we propose a penalized likelihood estimator to address this difficulty. The [Formula: see text]-type penalty we im...
High dimensional Sparse Gaussian Graphical Mixture Model
This paper considers the problem of network reconstruction from heterogeneous data using a Gaussian Graphical Mixture Model (GGMM). It is well known that parameter estimation in this context is challenging due to the large number of variables coupled with the degenerate nature of the likelihood. We propose as a solution a penalized maximum likelihood technique by imposing an l1 penalty on the pre...
Journal
Journal title: Neural Processing Letters
Year: 2021
ISSN: 1573-773X, 1370-4621
DOI: https://doi.org/10.1007/s11063-021-10599-3